Prediction of microbial communities for urban metagenomics using neural network approach.
BACKGROUND: Microbes are closely associated with human health and disease, especially in densely populated cities. Understanding the microbial ecosystem of an urban environment is essential for cities to monitor the transmission of infectious diseases and detect potentially urgent threats. To this end, DNA samples have been collected and analyzed at subway stations in major cities. However, city-scale sampling with fine-grained geospatial resolution is expensive and laborious. In this paper, we introduce MetaMLAnn, a neural-network-based approach that infers microbial communities at unsampled locations from information reflecting different factors, including subway line networks, sampling material types, and microbial composition patterns. RESULTS: We evaluate the effectiveness of MetaMLAnn on the public metagenomics dataset collected from multiple locations in the New York and Boston subway systems. The experimental results suggest that MetaMLAnn consistently performs better than five other conventional classifiers at different taxonomic ranks. At the genus level, MetaMLAnn achieves F1 scores of 0.63 and 0.72 on the New York and Boston datasets, respectively. CONCLUSIONS: By exploiting heterogeneous features, MetaMLAnn captures the hidden interactions between microbial compositions and the urban environment, which enables precise predictions of microbial communities at unmeasured locations.
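To make the F1 figures in the abstract above easier to interpret, here is a minimal sketch of how an F1 score can be computed for one location by comparing the set of predicted genera with the set actually observed. The function name, the genera, and the per-location set formulation are illustrative assumptions; the abstract does not say how the authors aggregate F1 across locations or taxa.

```python
def f1_score(true_taxa, predicted_taxa):
    """F1 between the taxa observed at a location and the taxa predicted for it."""
    true_positives = len(true_taxa & predicted_taxa)
    if true_positives == 0:
        return 0.0  # also covers empty inputs, avoiding division by zero
    precision = true_positives / len(predicted_taxa)
    recall = true_positives / len(true_taxa)
    return 2 * precision * recall / (precision + recall)


# Hypothetical genera for one unsampled station (illustrative only,
# not taken from the New York or Boston datasets).
observed = {"Pseudomonas", "Acinetobacter", "Staphylococcus", "Bacillus"}
predicted = {"Pseudomonas", "Acinetobacter", "Enterobacter"}

score = f1_score(observed, predicted)
print(round(score, 3))
```

In this toy case two of the three predicted genera are correct (precision 2/3) and two of the four observed genera are recovered (recall 1/2), so the score is the harmonic mean of those two values.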
InfluencerRank: Discovering Effective Influencers via Graph Convolutional Attentive Recurrent Neural Networks
As influencers play considerable roles in social media marketing, companies
increase the budget for influencer marketing. Hiring effective influencers is
crucial in social influencer marketing, but it is challenging to find the right
influencers among hundreds of millions of social media users. In this paper, we
propose InfluencerRank that ranks influencers by their effectiveness based on
their posting behaviors and social relations over time. To represent the
posting behaviors and social relations, the graph convolutional neural networks
are applied to model influencers with heterogeneous networks during different
historical periods. By learning the network structure with the embedded node
features, InfluencerRank can derive informative representations for influencers
at each period. An attentive recurrent neural network finally distinguishes
highly effective influencers from other influencers by capturing the knowledge
of the dynamics of influencer representations over time. Extensive experiments
have been conducted on an Instagram dataset that consists of 18,397 influencers
with their 2,952,075 posts published within 12 months. The experimental results
demonstrate that InfluencerRank outperforms existing baseline methods. An
in-depth analysis further reveals that all of our proposed features and model
components are beneficial for discovering effective influencers. Comment: ICWSM 202
SLC2A10 genetic polymorphism predicts development of peripheral arterial disease in patients with type 2 diabetes. SLC2A10 and PAD in type 2 diabetes
Background: Recent data indicate that loss-of-function mutation in the gene encoding the facilitative glucose transporter GLUT10 (SLC2A10) causes arterial tortuosity syndrome via upregulation of the TGF-β pathway in the arterial wall, a mechanism possibly causing vascular changes in diabetes. Methods: We genotyped 10 single nucleotide polymorphisms and one microsatellite spanning 34 kb across the SLC2A10 gene in a prospective cohort of 372 diabetic patients. Their association with the development of peripheral arterial disease (PAD) in type 2 diabetic patients was analyzed. Results: At baseline, several common SNPs of the SLC2A10 gene were associated with PAD in type 2 diabetic patients. A common haplotype was associated with higher risk of PAD in type 2 diabetic patients (haplotype frequency: 6.3%, P = 0.03; odds ratio [OR]: 14.5; 95% confidence interval [CI]: 1.3-160.7) at baseline. Over an average follow-up period of 5.7 years, carriers of the risk-conferring haplotype were more likely to develop PAD (P = 0.007; hazard ratio: 6.78; 95% CI: 1.66-27.6) than were non-carriers. These associations remained significant after adjustment for other risk factors of PAD. Conclusion: Our data demonstrate that genetic polymorphism of the SLC2A10 gene is an independent risk factor for PAD in type 2 diabetes.
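The wide confidence interval reported above for the odds ratio is typical when one cell of the comparison is small. As a rough illustration only, the sketch below computes an odds ratio and its Wald 95% confidence interval from a 2x2 table. The counts are hypothetical and chosen for illustration, and this is the standard textbook calculation, not necessarily the haplotype analysis the authors performed.

```python
import math


def odds_ratio_ci(a, b, c, d, z=1.96):
    """Odds ratio and Wald confidence interval from a 2x2 table.

    a: carriers with disease      b: carriers without disease
    c: non-carriers with disease  d: non-carriers without disease
    """
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR); small cells inflate it quickly.
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Hypothetical counts (not from the study): few carriers overall.
or_value, ci_low, ci_high = odds_ratio_ci(a=4, b=2, c=40, d=300)
print(f"OR = {or_value:.1f}, 95% CI = {ci_low:.1f} to {ci_high:.1f}")
```

Because the interval is symmetric on the log scale, a handful of carriers is enough to stretch the upper bound far above the point estimate, which is the pattern seen in the reported baseline result.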
Multi-scale Human Behavior Modeling with Heterogeneous Data
In this era of big data, massive amounts of data are generated from heterogeneous sources every day, providing an unprecedented opportunity to deepen our understanding of complex human behaviors. Modeling human behaviors requires robust computational methods that can not only capture semantics and useful insights from sparse and heterogeneous data, but also unravel sophisticated human behaviors at different scales. In addition, the enormous data velocity and the unparalleled scale of deep models pose significant challenges to efficiency. In this dissertation, we present a collection of research results that systematically improve the ecosystem of human behavior modeling based on representation learning. For heterogeneous data in various settings, we present practical representation learning methods that capture their semantics effectively and efficiently. Moreover, these representation learning methods can model different behaviors with atomic, compositional, and explainable operations, thereby modeling human behaviors at different scales. As a result, our proposed approaches not only address various real-world challenges in diverse domains, but also show the potential to incorporate valuable domain knowledge into machine learning.
Improving Ranking Consistency for Web Search by Leveraging a Knowledge Base and Search Logs
In this paper, we propose a new idea called ranking consistency in web search. Relevance ranking is one of the biggest problems in creating an effective web search system. Given some queries with similar search intents, conventional approaches typically optimize ranking models for each query separately. Hence, modern search engines produce inconsistent rankings. It is expected that the search results of different queries with similar search intents should preserve ranking consistency. The aim of this paper is to learn consistent rankings in search results in order to improve relevance ranking in web search. We propose a re-ranking model that simultaneously improves relevance ranking and ranking consistency by leveraging knowledge bases and search logs. To the best of our knowledge, our work offers the first solution to improving relevance ranking through ranking consistency. Extensive experiments have been conducted using the Freebase knowledge base and the large-scale query log of a commercial search engine. The experimental results show that our approach significantly improves relevance ranking and ranking consistency. Two user surveys on the crowd-sourcing platform Amazon Mechanical Turk also show that users are sensitive to ranking consistency and prefer the more consistent ranking results generated by our model.
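One way to picture ranking consistency between two queries with similar intent is to check whether the results they share appear in the same relative order. The sketch below does this with Kendall's tau over the shared results. This measure, the function name, and the example URLs are assumptions made for illustration; the abstract does not state how the authors quantify consistency.

```python
from itertools import combinations


def ranking_consistency(ranking_a, ranking_b):
    """Kendall's tau over the results shared by two ranked lists.

    Returns 1.0 when shared results appear in the same order, -1.0 when the
    order is fully reversed, and None when fewer than two results are shared.
    Assumes each list contains no duplicate entries.
    """
    pos_a = {doc: i for i, doc in enumerate(ranking_a)}
    pos_b = {doc: i for i, doc in enumerate(ranking_b)}
    shared = [doc for doc in ranking_a if doc in pos_b]
    if len(shared) < 2:
        return None  # not enough overlap to compare orderings
    concordant = discordant = 0
    for x, y in combinations(shared, 2):
        # A pair is concordant when both lists order x and y the same way.
        if (pos_a[x] - pos_a[y]) * (pos_b[x] - pos_b[y]) > 0:
            concordant += 1
        else:
            discordant += 1
    return (concordant - discordant) / (concordant + discordant)


# Hypothetical result lists for two queries with similar intent.
results_q1 = ["hotel-a.example", "hotel-b.example", "hotel-c.example", "hotel-d.example"]
results_q2 = ["hotel-b.example", "hotel-a.example", "hotel-c.example", "hotel-e.example"]

tau = ranking_consistency(results_q1, results_q2)
print(tau)
```

Here three results are shared and one of the three pairs is ordered differently, so the score falls between fully consistent and fully reversed.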
Analyzing Social Event Participants for a Single Organizer
Online social networking services allow people to initiate various kinds of offline social events (e.g., cocktail parties, group buying, and study groups) and to invite friends or strangers to participate in the events, either manually or collaboratively. However, such invitation processes are tedious, and irrelevant or uninterested users, and even spammers, can unexpectedly be added to an event. In this paper, we investigate the characteristics of social event participants for a specific organizer. Specifically, we examine how social networks, user profiles, and geo-locations affect user participation when a social event is held by a single organizer. An extensive analysis has been conducted on a real-world dataset from the event-based social network Meetup. The results of the data analysis demonstrate that these factors do influence users' event participation.